Load all MoE experts during warmup and make warmup 1 token #198

saood06 · 2025-02-09T22:23:09Z

First commit is a port of: ggml-org/llama.cpp#11571

The second commit is based on what fairydreaming reported in ggml-org/llama.cpp#11733, and also unifies warmup to always be one token.

This allows warmup to actually warm up an MoE model, as all experts are exercised.
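As a rough illustration of the idea (not the code in this PR), the sketch below shows one way a warmup pass could exercise every expert: the number of experts the router selects is widened from the usual top-k to the full expert count while warming up. The names `moe_hparams`, `experts_for_pass`, and `select_experts` are hypothetical and exist only for this example.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the hyperparameters relevant to expert routing.
struct moe_hparams {
    int n_expert;       // total number of experts in an MoE layer
    int n_expert_used;  // experts normally activated per token (top-k)
};

// Assumption for this sketch: during warmup we route to every expert,
// so that all expert weights are read at least once.
int experts_for_pass(const moe_hparams & hp, bool warmup) {
    return warmup ? hp.n_expert : hp.n_expert_used;
}

// Return the ids of the k highest-scoring experts for a single token,
// ordered from highest to lowest router score.
std::vector<int> select_experts(const std::vector<float> & router_logits, int k) {
    std::vector<int> ids(router_logits.size());
    std::iota(ids.begin(), ids.end(), 0);
    std::partial_sort(ids.begin(), ids.begin() + k, ids.end(),
        [&](int a, int b) { return router_logits[a] > router_logits[b]; });
    ids.resize(k);
    return ids;
}
```

With 8 experts and a top-k of 2, a normal pass would touch only the two best-scoring experts for the token, while a warmup pass would touch all 8, regardless of the router scores.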

Co-authored-by: Stanisław Szymczyk <[email protected]>

ikawrakow

LGTM, but it does nothing on the single socket computers I have currently available, so relying on the comments in the linked PR and issue that this really improves things on NUMA systems.

saood06 · 2025-02-10T14:52:48Z

> LGTM, but it does nothing on the single socket computers I have currently available, so relying on the comments in the linked PR and issue that this really improves things on NUMA systems.

The first commit should work on any system to help MoE loading (DeepSeek is the most noticeable because of its large size and expert count, but it should help all MoE models). Only the second commit is designed to benefit NUMA systems.
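For the second commit, a minimal sketch of what "warmup is always one token" could look like is given below. The PR text does not say which token is used, so the choice here (prefer BOS, fall back to EOS, then token 0) is purely an assumption, and `token_id`, `TOKEN_NULL`, and `make_warmup_tokens` are hypothetical names for this example only.

```cpp
#include <cstdint>
#include <vector>

using token_id = std::int32_t;

// Hypothetical marker meaning "this model has no such special token".
constexpr token_id TOKEN_NULL = -1;

// Build a warmup batch that always contains exactly one token.
// Assumed preference order: BOS, then EOS, then token id 0 as a last resort.
std::vector<token_id> make_warmup_tokens(token_id bos, token_id eos) {
    if (bos != TOKEN_NULL) {
        return {bos};
    }
    if (eos != TOKEN_NULL) {
        return {eos};
    }
    return {0};
}
```

The point of the sketch is only that the warmup batch has the same size no matter which special tokens the model defines.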

saood06 and others added 2 commits February 9, 2025 15:32

Load all MoE experts during warmup

ca4e8e5

Co-authored-by: Stanisław Szymczyk <[email protected]>

Unify warmup to one token

3702743

ikawrakow approved these changes Feb 10, 2025

ikawrakow merged commit a366a3d into main Feb 10, 2025

Nexesenex added a commit to Nexesenex/ik_llama.cpp.nxs that referenced this pull request Mar 3, 2025

Revert "Load all MoE experts during warmup and make warmup 1 token (ikawrakow#198)"

93e66a6

This was referenced Jul 23, 2025

Bug: The streaming every couple of rows blocks for 5-8s #464

Closed

Fix pauses after a comma #639

Merged
